NSF PAR Search | NSF Public Access Repository

Homogeneity Tests of Covariance for High-Dimensional Functional Data with Applications to Event Segmentation

https://doi.org/10.1111/biom.13844

Zhong, Ping-Shou (February 2023, Biometrics)

Abstract: We consider inference problems for high-dimensional (HD) functional data with a dense number of T repeated measurements taken for a large number of p variables from a small number of n experimental units. The spatial and temporal dependence, high dimensionality, and dense number of repeated measurements pose theoretical and computational challenges. This paper has two aims; our first aim is to solve the theoretical and computational challenges in testing equivalence among covariance matrices from HD functional data. The second aim is to provide computationally efficient and tuning-free tools with guaranteed stochastic error control. The weak convergence of the stochastic process formed by the test statistics is established under the “large p, large T, and small n” setting. If the null is rejected, we further show that the locations of the change points can be estimated consistently. The estimator's rate of convergence is shown to depend on the data dimension, sample size, number of repeated measurements, and signal-to-noise ratio. We also show that our proposed computation algorithms can significantly reduce the computation time and are applicable to real-world data with a large number of HD-repeated measurements (e.g., functional magnetic resonance imaging (fMRI) data). Simulation results demonstrate both the finite sample performance and computational effectiveness of our proposed procedures. We observe that the empirical size of the test is well controlled at the nominal level, and the locations of multiple change points can be accurately identified. An application to fMRI data demonstrates that our proposed methods can identify event boundaries in the preface of the television series Sherlock. Code to implement the procedures is available in an R package named TechPhD.
Unified Tests for Nonparametric Functions in RKHS With Kernel Selection and Regularization

https://doi.org/10.5705/ss.202020.0339

He, Tao; Zhong, Ping-Shou; Cui, Yuehua; Mandrekar, Vidyadhar (January 2023, Statistica Sinica)

Full Text Available
Asymptotic independence of spiked eigenvalues and linear spectral statistics for large sample covariance matrices

https://doi.org/10.1214/22-AOS2183

Zhang, Zhixiang; Zheng, Shurong; Pan, Guangming; Zhong, Ping-Shou (August 2022, The Annals of Statistics)

Full Text Available
Toward Systematic Considerations of Missingness in Visual Analytics

https://doi.org/10.1109/VIS54862.2022.00031

Sun, Maoyuan; Ma, Yue; Wang, Yuanxin; Li, Tianyi; Zhao, Jian; Liu, Yujun; Zhong, Ping-Shou (October 2022, 2022 IEEE Visualization and Visual Analytics (VIS))

Full Text Available
Multivariate analysis of variance and change points estimation for high‐dimensional longitudinal data

https://doi.org/10.1111/sjos.12460

Zhong, Ping‐Shou; Li, Jun; Kokoszka, Piotr (April 2020, Scandinavian Journal of Statistics)

Full Text Available
Homogeneity tests of covariance matrices with high-dimensional longitudinal data

https://doi.org/10.1093/biomet/asz011

Zhong, Ping-Shou; Li, Runze; Santo, Shawn (May 2019, Biometrika)

Summary: This paper deals with the detection and identification of changepoints among covariances of high-dimensional longitudinal data, where the number of features is greater than both the sample size and the number of repeated measurements. The proposed methods are applicable under general temporal-spatial dependence. A new test statistic is introduced for changepoint detection, and its asymptotic distribution is established. If a changepoint is detected, an estimate of the location is provided. The rate of convergence of the estimator is shown to depend on the data dimension, sample size, and signal-to-noise ratio. Binary segmentation is used to estimate the locations of possibly multiple changepoints, and the corresponding estimator is shown to be consistent under mild conditions. Simulation studies provide the empirical size and power of the proposed test and the accuracy of the changepoint estimator. An application to a time-course microarray dataset identifies gene sets with significant gene interaction changes over time.
Full Text Available
